Efficient Test and Visualization of Multi-Set Intersections.

Authors

  • Minghui Wang
  • Yongzhong Zhao
  • Bin Zhang
Abstract

Identification of sets of objects with shared features is a common operation in all disciplines. Analysis of intersections among multiple sets is fundamental for in-depth understanding of their complex relationships. However, so far no method has been developed to assess statistical significance of intersections among three or more sets. Moreover, the state-of-the-art approaches for visualization of multi-set intersections are not scalable. Here, we first developed a theoretical framework for computing the statistical distributions of multi-set intersections based upon combinatorial theory, and then accordingly designed a procedure to efficiently calculate the exact probabilities of multi-set intersections. We further developed multiple efficient and scalable techniques to visualize multi-set intersections and the corresponding intersection statistics. We implemented both the theoretical framework and the visualization techniques in a unified R software package, SuperExactTest. We demonstrated the utility of SuperExactTest through an intensive simulation study and a comprehensive analysis of seven independently curated cancer gene sets as well as six disease or trait associated gene sets identified by genome-wide association studies. We expect SuperExactTest developed by this study will have a broad range of applications in scientific data analysis in many disciplines.
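The exact-probability calculation mentioned in the abstract can be illustrated with a minimal sketch. It is not the SuperExactTest implementation and does not use that package's API. It assumes a null model in which each set is drawn independently and uniformly at random, without replacement, from a common population of N items. Under that assumption, the intersection of the first k-1 sets is a fixed group of j items when the k-th set is drawn, so the overlap follows a hypergeometric distribution and the full distribution can be built one set at a time. The function names (`hypergeom_pmf`, `intersection_pmf`, `upper_tail`) and the example sizes are hypothetical.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, K, n):
    """P(overlap == x) when n items are drawn without replacement
    from N items, K of which are marked."""
    if x < 0 or x > min(K, n) or n - x > N - K:
        return Fraction(0)
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))


def intersection_pmf(sizes, N):
    """Exact distribution of the intersection size of independent
    random sets with the given sizes (assumed null model)."""
    # The "intersection" of the first set alone is the set itself.
    dist = {sizes[0]: Fraction(1)}
    for n in sizes[1:]:
        updated = {}
        for j, pj in dist.items():
            # Given a current intersection of size j, the overlap with
            # the next set of size n is hypergeometric.
            for x in range(min(j, n) + 1):
                p = hypergeom_pmf(x, N, j, n)
                if p:
                    updated[x] = updated.get(x, Fraction(0)) + pj * p
        dist = updated
    return dist


def upper_tail(sizes, N, observed):
    """P(intersection size >= observed) under the assumed null model."""
    dist = intersection_pmf(sizes, N)
    return sum((p for x, p in dist.items() if x >= observed), Fraction(0))


if __name__ == "__main__":
    # Hypothetical example: three sets of sizes 3, 4, 5 from 10 items.
    print(float(upper_tail([3, 4, 5], N=10, observed=2)))
```

Exact rational arithmetic (`Fraction`) keeps the sketch free of rounding error at the cost of speed. For populations the size of a genome, a practical implementation would more likely work with floating-point or log-scale probabilities.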


Similar resources

A Hybrid Method for Segmentation and Visualization of Teeth in Multi-Slice CT scan Images

Introduction: Various computer assisted medical procedures such as dental implant, orthodontic planning, face, jaw and cosmetic surgeries require automatic quantification and volumetric visualization of teeth. In this regard, segmentation is a major step. Material and Methods: In this paper, inspired by our previous experiences and considering the anatomical knowledge of teeth and jaws, we prop...


Efficient test sites for multi-environment evaluation of sugarcane genotypes in Thailand

Multi-environment trials (METs) of crop genotypes are costly and require efficient test sites for cost effectiveness. This study aimed to identify efficient test sites for METs of sugarcane (Saccharum spp.) genotypes in Thailand, utilizing data from 10 sugarcane genotypes conducted at nine locations covering different sugarcane growing regions of the country for two crop-classes. Cluster an...


Comparing the Effectiveness of Teaching Strategy Learning and Visualization and Self-Regulation Training on Student Problem-Solving Skills

Background and Aim: The purpose of this study was to compare the effectiveness of teaching strategy learning and visualization and self-regulation training on student problem-solving skills. Materials and Methods: The present research was experimental. The research population consisted of all 7th grade students in Tehran during the academic year 1396-1397. Using multi-stage cluster sampling, 12...


A Multi-Objective Particle Swarm Optimization for Mixed-Model Assembly Line Balancing with Different Skilled Workers

This paper presents a multi-objective Particle Swarm Optimization (PSO) algorithm for worker assignment and mixed-model assembly line balancing problem when task times depend on the worker’s skill level. The objectives of this model are minimization of the number of stations (equivalent to the maximization of the weighted line efficiency), minimization of the weighted smoothness index and minim...


Efficient Reverse Converter for Three Modules Set {2^n-1,2^(n+1)-1,2^n} in Multi-Part RNS

A Residue Number System is a numeral system in which arithmetic operations are performed in parallel. One of the main factors affecting the system's performance is the complexity of the reverse converter. The complexity of this part should not offset the speed gained by the parallel arithmetic unit. Therefore, in this paper a high-speed converter for the moduli set {2^n-1, ...



Journal:
  • Scientific Reports

Volume 5

Publication date: 2015